---
sidebar_class_name: hidden
hide_table_of_contents: true
---

# Conversational Retrieval QA

:::info
Looking for the LCEL version? Click [here](/docs/modules/chains/popular/chat_vector_db).
:::

The `ConversationalRetrievalQAChain` builds on `RetrievalQAChain` by adding a chat history component.

It works in three steps. First, it combines the chat history (either passed in explicitly or retrieved from the provided memory) and the question into a standalone question. Next, it looks up relevant documents from the retriever. Finally, it passes those documents and the standalone question to a question answering chain to return a response.
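
The three steps can be illustrated with a small, library-independent sketch. Everything in it (`conversationalQA`, `stubLLM`, `stubRetriever`, and the prompt wording) is a hypothetical stand-in rather than LangChain code, and it runs synchronously for brevity, whereas real model and retriever calls are asynchronous.

```typescript
// Conceptual sketch: all names below are hypothetical stand-ins, not LangChain APIs.
type Doc = { pageContent: string };
type LLM = (prompt: string) => string;
type Retriever = (query: string) => Doc[];

function conversationalQA(
  llm: LLM,
  retriever: Retriever,
  chatHistory: string,
  question: string
): { text: string; standaloneQuestion: string; sourceDocuments: Doc[] } {
  // Step 1: condense the history and follow-up into a standalone question.
  const standaloneQuestion = llm(
    `Rephrase the follow-up as a standalone question.\n` +
      `Chat history:\n${chatHistory}\nFollow-up: ${question}`
  );
  // Step 2: look up documents relevant to the standalone question.
  const sourceDocuments = retriever(standaloneQuestion);
  // Step 3: answer from the retrieved documents and the standalone question.
  const context = sourceDocuments.map((d) => d.pageContent).join("\n\n");
  const text = llm(
    `Answer using only this context:\n${context}\nQuestion: ${standaloneQuestion}`
  );
  return { text, standaloneQuestion, sourceDocuments };
}

// Stubs so the sketch runs without a model or a vector store.
const retrieverQueries: string[] = [];
const stubLLM: LLM = (prompt) =>
  prompt.indexOf("Rephrase") === 0 ? "What is the capital of France?" : "Paris";
const stubRetriever: Retriever = (query) => {
  retrieverQueries.push(query);
  return [{ pageContent: "Paris is the capital of France." }];
};

const flowResult = conversationalQA(
  stubLLM,
  stubRetriever,
  "Human: Tell me about France.\nAI: France is a country in Europe.",
  "What is its capital?"
);
console.log(flowResult.standaloneQuestion, "->", flowResult.text);
```

Note that the retriever only ever sees the standalone question, never the raw follow-up, which is why the quality of the condensing step matters.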

To create one, you will need a retriever. In the example below, we create one from a vector store, which in turn is built from embeddings.

import CodeBlock from "@theme/CodeBlock";
import ConvoRetrievalQAExample from "@examples/chains/conversational_qa_legacy.ts";

<CodeBlock language="typescript">{ConvoRetrievalQAExample}</CodeBlock>

In the code snippet above, the `fromLLM` method of the `ConversationalRetrievalQAChain` class has the following signature:

```typescript
static fromLLM(
  llm: BaseLanguageModelInterface,
  retriever: BaseRetrieverInterface,
  options?: {
    questionGeneratorChainOptions?: {
      llm?: BaseLanguageModelInterface;
      template?: string;
    };
    qaChainOptions?: QAChainParams;
    returnSourceDocuments?: boolean;
  }
): ConversationalRetrievalQAChain
```

Here's an explanation of each of the attributes of the options object:

- `questionGeneratorChainOptions`: An object that allows you to pass a custom template and LLM to the underlying question generation chain.
  - If `template` is provided, the `ConversationalRetrievalQAChain` uses it to generate a question from the conversation context instead of using the question provided in the `question` parameter.
  - Passing in a separate LLM (`llm`) here allows you to use a cheaper/faster model to create the condensed question while using a more powerful model for the final response, and can reduce unnecessary latency.
- `qaChainOptions`: Options that allow you to customize the specific QA chain used in the final step. The default is the [`StuffDocumentsChain`](/docs/modules/chains/document/stuff), but you can customize which chain is used by passing in a `type` parameter.
  **Passing specific options here is completely optional**, but can be useful if you want to customize the way the response is presented to the end user, or if you have too many documents for the default `StuffDocumentsChain`.
  You can see [the API reference of the usable fields here](https://api.js.langchain.com/types/langchain_chains.QAChainParams.html). If you want `chat_history` to be available to the final `qaChain`, which ultimately answers the user's question, you must pass a custom `qaTemplate` that takes `chat_history` as an input. The default template does not include it and only receives the `context` documents and the generated `question`.
- `returnSourceDocuments`: A boolean value that indicates whether the `ConversationalRetrievalQAChain` should return the source documents that were used to generate the answer. If set to `true`, the documents are included in the result returned by the `call()` method. This can be useful if you want to let the user see the sources behind the answer. If not set, the value defaults to `false`.
  - If you are using this option and passing in a memory instance, set `inputKey` and `outputKey` on the memory instance to the same values as the chain input and final conversational chain output. These default to `"question"` and `"text"` respectively, and specify the values that the memory should store.
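
As a rough illustration of how these options and their defaults fit together, the sketch below uses simplified local types in place of the library's. `SketchModel`, `SketchOptions`, and `resolveOptions` are hypothetical names invented for this example, and falling back to the main model when no separate `llm` is given is an assumption of the sketch.

```typescript
// Simplified local stand-ins, NOT the library's types. The real
// QAChainParams type has more fields than the `type` shown here.
type SketchModel = { name: string };
type SketchOptions = {
  questionGeneratorChainOptions?: { llm?: SketchModel; template?: string };
  qaChainOptions?: { type?: "stuff" | "map_reduce" | "refine" };
  returnSourceDocuments?: boolean;
};

// Hypothetical helper that makes the documented defaults explicit.
function resolveOptions(mainLLM: SketchModel, options: SketchOptions = {}) {
  return {
    // Assumption of this sketch: without a separate `llm`, reuse the main model.
    questionGeneratorLLM: options.questionGeneratorChainOptions?.llm ?? mainLLM,
    // undefined here means "use the built-in question generation template".
    questionGeneratorTemplate: options.questionGeneratorChainOptions?.template,
    // The text above names StuffDocumentsChain as the default QA chain.
    qaChainType: options.qaChainOptions?.type ?? "stuff",
    // Source documents are only returned when explicitly requested.
    returnSourceDocuments: options.returnSourceDocuments ?? false,
  };
}

// A cheaper model condenses the question; the main model writes the answer.
const resolved = resolveOptions(
  { name: "strong-model" },
  {
    questionGeneratorChainOptions: { llm: { name: "fast-model" } },
    returnSourceDocuments: true,
  }
);
console.log(resolved);
```

Every field is optional, so calling `resolveOptions` with only a model yields the default configuration.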

## Built-in Memory

Here's a customization example using a faster LLM to generate questions and a slower, more comprehensive LLM for the final answer. It uses a built-in memory object and returns the referenced source documents.
Because `returnSourceDocuments` is set and the chain therefore returns multiple values, we must set `inputKey` and `outputKey` on the memory instance
so that it knows which values to store.

import IntegrationInstallTooltip from "@mdx_components/integration_install_tooltip.mdx";

<IntegrationInstallTooltip></IntegrationInstallTooltip>

```bash npm2yarn
npm install @langchain/openai @langchain/community
```

import ConvoQABuiltInExample from "@examples/chains/conversational_qa_built_in_memory_legacy.ts";

<CodeBlock language="typescript">{ConvoQABuiltInExample}</CodeBlock>
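
To see why the keys are needed, consider a simplified model of what a memory object has to do when it saves a turn. The helpers below (`pickValue`, `saveTurn`) are hypothetical and only mimic the idea. They are not the library's memory implementation.

```typescript
// Conceptual sketch: hypothetical helpers, not the library's memory classes.
type Values = { [key: string]: unknown };

// Pick the value to store. With several candidate keys and no explicit
// key, the choice is ambiguous, so this sketch refuses to guess.
function pickValue(values: Values, key?: string): string {
  const keys = Object.keys(values);
  const chosen = key !== undefined ? key : keys.length === 1 ? keys[0] : undefined;
  if (chosen === undefined) {
    throw new Error(`Ambiguous keys (${keys.join(", ")}): set a key explicitly`);
  }
  return String(values[chosen]);
}

function saveTurn(
  store: string[],
  inputs: Values,
  outputs: Values,
  keys: { inputKey?: string; outputKey?: string } = {}
): void {
  store.push(`Human: ${pickValue(inputs, keys.inputKey)}`);
  store.push(`AI: ${pickValue(outputs, keys.outputKey)}`);
}

const turnInputs = { question: "What is the capital of France?" };
// With returnSourceDocuments enabled, the output carries two keys.
const turnOutputs = { text: "Paris", sourceDocuments: ["doc-1"] };

const savedTurns: string[] = [];
saveTurn(savedTurns, turnInputs, turnOutputs, {
  inputKey: "question",
  outputKey: "text",
});
console.log(savedTurns);
```

Without the explicit `outputKey`, the sketch cannot tell whether to store `text` or `sourceDocuments`, which mirrors the ambiguity the real memory instance faces.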

## Streaming

You can also use the two-LLM setup described above to stream only the final response from the chain, without streaming output from the intermediate standalone question generation step. Here's an example:

import ConvoQAStreamingExample from "@examples/chains/conversational_qa_streaming_legacy.ts";

<CodeBlock language="typescript">{ConvoQAStreamingExample}</CodeBlock>
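
The underlying idea is that only the answering model has a token handler attached. The sketch below models this with stubs. `makeStubModel` and `TokenHandler` are hypothetical names, and splitting a canned reply on spaces merely stands in for real token streaming.

```typescript
// Conceptual sketch: hypothetical stand-ins, not LangChain's streaming API.
type TokenHandler = (token: string) => void;
type StubModel = (prompt: string) => string;

// Build a stub model that emits its canned reply word by word to an
// optional handler, mimicking a model with streaming callbacks attached.
function makeStubModel(reply: string, onToken?: TokenHandler): StubModel {
  return (_prompt: string) => {
    if (onToken) {
      for (const token of reply.split(" ")) onToken(token);
    }
    return reply;
  };
}

const streamedTokens: string[] = [];
// No handler: the intermediate standalone question is never streamed.
const questionModel = makeStubModel("What is the capital of France?");
// Handler attached: only the final answer reaches the consumer.
const answerModel = makeStubModel("The capital is Paris.", (token) => {
  streamedTokens.push(token);
});

const condensed = questionModel("Chat history plus follow-up question");
const finalAnswer = answerModel(`Context and question: ${condensed}`);
console.log(streamedTokens.join(" "), "|", finalAnswer);
```

Because the handler belongs to one model rather than to the chain as a whole, the condensed question stays internal even though both models run on every call.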

## Externally-Managed Memory

For this chain, if you'd like to format the chat history in a custom way (or pass in chat messages directly for convenience), you can pass the chat history explicitly. Omit the `memory` option and pass
a `chat_history` string or an array of [HumanMessages](https://api.js.langchain.com/classes/langchain_core_messages.HumanMessage.html) and [AIMessages](https://api.js.langchain.com/classes/langchain_core_messages.AIMessage.html) directly to the `chain.call` method:

import ConvoQAExternalMemoryExample from "@examples/chains/conversational_qa_external_memory_legacy.ts";

<CodeBlock language="typescript">{ConvoQAExternalMemoryExample}</CodeBlock>
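
If you manage the history yourself as a string, you need some way to record each exchange and render it. Below is a minimal sketch of one possible approach. The `ChatTurn` shape, the helper names, and the `Human:`/`Assistant:` labels are all assumptions made for this example, not conventions required by the chain.

```typescript
// Hypothetical turn shape for this sketch; the library's message classes differ.
type ChatTurn = { role: "human" | "ai"; content: string };

// Render the turns as one string in whatever format suits your prompt.
function formatChatHistory(turns: ChatTurn[]): string {
  return turns
    .map((turn) => `${turn.role === "human" ? "Human" : "Assistant"}: ${turn.content}`)
    .join("\n");
}

// The caller owns the history: append each exchange after the chain responds.
function recordExchange(
  turns: ChatTurn[],
  question: string,
  answer: string
): ChatTurn[] {
  const added: ChatTurn[] = [
    { role: "human", content: question },
    { role: "ai", content: answer },
  ];
  return turns.concat(added);
}

let externalTurns: ChatTurn[] = [];
externalTurns = recordExchange(
  externalTurns,
  "What is the capital of France?",
  "Paris."
);
const formattedHistory = formatChatHistory(externalTurns);
console.log(formattedHistory);
```

The resulting string is what you would supply as `chat_history` on the next call, alongside the new question.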

## Prompt Customization

If you want to further change the chain's behavior, you can change the prompts for both the underlying question generation chain and the QA chain.

One case where you might want to do this is to improve the chain's ability to answer meta questions about the chat history.
By default, the only input to the QA chain is the standalone question generated from the question generation chain.
This poses a challenge when asking meta questions about information in previous interactions from the chat history.

For example, if you introduce a friend Bob and mention that he is 28, the chain cannot provide his age when asked a question like "How old is Bob?".
This limitation occurs because the bot searches for Bob in the vector store, rather than considering the message history.

You can pass an alternative prompt for the question generation chain that also returns parts of the chat history relevant to the answer,
allowing the QA chain to answer meta questions with the additional context:

import ConvoRetrievalQAWithCustomPrompt from "@examples/chains/conversation_qa_custom_prompt_legacy.ts";

<CodeBlock language="typescript">{ConvoRetrievalQAWithCustomPrompt}</CodeBlock>
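
To make the idea concrete, here is a sketch of what such a question generation prompt might look like once filled in. The template wording and the `fillTemplate` helper are hypothetical, written for this illustration only, and the `{chat_history}` and `{question}` placeholder names are assumed to match the inputs of the question generation chain.

```typescript
// Hypothetical template for this sketch, not the library's default prompt.
// It asks the model to carry relevant history into the standalone question.
const customQuestionTemplate = [
  "Given the chat history and a follow-up question, rephrase the follow-up",
  "as a standalone question. Include any facts from the chat history that",
  "are needed to answer it.",
  "",
  "Chat history:",
  "{chat_history}",
  "Follow-up question: {question}",
  "Standalone question with relevant context:",
].join("\n");

// Minimal placeholder substitution so the sketch runs without a prompt library.
// Unknown placeholders are left untouched.
function fillTemplate(
  template: string,
  values: { [key: string]: string }
): string {
  return template.replace(/\{(\w+)\}/g, (match: string, key: string) => {
    const value = values[key];
    return value !== undefined ? value : match;
  });
}

const filledPrompt = fillTemplate(customQuestionTemplate, {
  chat_history: "Human: My friend Bob is 28.\nAI: Nice to meet Bob!",
  question: "How old is Bob?",
});
console.log(filledPrompt);
```

With a prompt like this, the model can fold "Bob is 28" into its output, so the QA chain receives that fact even though it never sees the chat history itself.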

Keep in mind that adding more context to the prompt in this way may distract the LLM from other relevant retrieved information.
